Networks uncover hidden lexical borrowing in Indo-European language evolution
Abstract
Language evolution is traditionally described in terms of family trees with ancestral languages splitting into descendant languages. However, it has long been recognized that language evolution also entails horizontal components, most commonly through lexical borrowing. For example, the English language was heavily influenced by Old Norse and Old French; eight per cent of its basic vocabulary is borrowed. Borrowing is a distinctly non-tree-like process (akin to horizontal gene transfer in genome evolution) that cannot be recovered by phylogenetic trees. Here, we infer the frequency of hidden borrowing among 2346 cognates (etymologically related words) of basic vocabulary distributed across 84 Indo-European languages. The dataset includes 124 (5%) known borrowings. Applying the uniformitarian principle to inventory dynamics in past and present basic vocabularies, we find that 1373 (61%) of the cognates have been affected by borrowing during their history. Our approach correctly identified 117 (94%) known borrowings. Reconstructed phylogenetic networks that capture both vertical and horizontal components of evolutionary history reveal that, on average, eight per cent of the words of basic vocabulary in each Indo-European language were involved in borrowing during evolution. Basic vocabulary is often assumed to be relatively resistant to borrowing. Our results indicate that the impact of borrowing is far more widespread than previously thought.
Similar resources
Early Phonological and Lexical Development of a Farsi Speaking Child: A Longitudinal Case Study
The present study aims at the description and analysis of the phonological and lexical development of a child who is acquiring Farsi as his first language. The child's language production at the holophrastic stage of language development, mainly single words, is observed and recorded longitudinally for nearly seven months, from the age of 16 months until he turned 23 months. An attempt is mad...
The shape and tempo of language evolution.
There are approximately 7000 languages spoken in the world today. This diversity reflects the legacy of thousands of years of cultural evolution. How far back we can trace this history depends largely on the rate at which the different components of language evolve. Rates of lexical evolution are widely thought to impose an upper limit of 6000-10,000 years on reliably identifying language relat...
From Words to Dates: Water into Wine, Mathemagic or Phylogenetic Inference?
Gray & Atkinson’s (2003) application of quantitative phylogenetic methods to Dyen, Kruskal & Black’s (1992) Indo-European database produced controversial divergence time estimates. Here we test the robustness of these results using an alternative data set of ancient Indo-European languages. We employ two very different stochastic models of lexical evolution – Gray & Atkinson’s (2003) finite-site...
Ancestry-constrained Phylogenetic Analysis Supports the Indo-European Steppe Hypothesis
Discussion of Indo-European origins and dispersal focuses on two hypotheses. Qualitative evidence from reconstructed vocabulary and correlations with archaeological data suggest that Indo-European languages originated in the Pontic-Caspian steppe and spread together with cultural innovations associated with pastoralism, beginning c. 6500–5500 bp. An alternative hypothesis, according to which Ind...
The semantic structure of sensory vocabulary in an African language
The widespread occurrence of ideophones, large classes of words specialized in evoking sensory imagery, is little known outside linguistics and anthropology. Ideophones are a common feature in many of the world’s languages but are underdeveloped in English and other Indo-European languages. Here we study the meanings of ideophones in Siwu (a Kwa language from Ghana) using a pile-sorting task. T...